# -*- mode: org; -*-
#+HTML_HEAD: <link rel="stylesheet" type="text/css" href="http://www.pirilampo.org/styles/readtheorg/css/htmlize.css"/>
#+HTML_HEAD: <link rel="stylesheet" type="text/css" href="http://www.pirilampo.org/styles/readtheorg/css/readtheorg.css"/>
#+TITLE: AWK

* AWK and GAWK
 /Dependencies:/ Before learning awk, it helps to be familiar with:
    - Unix Pipes
    - Shell Basics
    - Data Streams
    - Unix-like Operating Systems
 
+ Before we start learning awk, please create the following file named ~animals.txt~:
#+BEGIN_EXAMPLE 
Animals Quantity
Dogs    32
Cats    17
Birds   25
Cows    7
Ducks   9
Pigs   12
#+END_EXAMPLE
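
One way to create the file from the command line is a here-document. This is only a sketch, assuming a POSIX-compatible shell; any text editor works just as well:

#+BEGIN_SRC sh
# Write the sample data to animals.txt in the current directory.
# Quoting 'EOF' keeps the shell from expanding anything inside the block.
cat > animals.txt <<'EOF'
Animals Quantity
Dogs    32
Cats    17
Birds   25
Cows    7
Ducks   9
Pigs   12
EOF

# Quick check: count the lines (1 header + 6 animals).
wc -l < animals.txt
#+END_SRC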

** Simple Description of AWK:
AWK is a programming language that originated on UNIX. It is designed for text processing and is commonly used for data extraction and report generation. As a programming language, its syntax will look familiar to anyone who knows C, Python or Bash, among others.

(*Note:* This document covers GNU awk (GAWK), which is the [[https://www.gnu.org/philosophy/free-sw.html][GNU Project's]] implementation of AWK. *End Note*)

*** Typical uses of awk:
- Text processing.
- Data extraction.
- Formatted text reports.
- Arithmetic operations.
- String operations.
- Many more!

** Basic example
The first program we will write is very simple: it just prints "hello world!":
#+BEGIN_EXAMPLE awk 
$ echo "hello world!" | awk '{print}'
hello world!
#+END_EXAMPLE

*What just happened?*
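/awk reads its input one line at a time and runs a program made of pattern-action rules. Here the input is "hello world!" and the program is the single action 'print', with no pattern. A rule without a pattern matches every line, and 'print' without arguments prints the whole current line. That is why we see "hello world!" as a result./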

As we can see, awk is a language that processes input text. In a nutshell, this is what awk does:

1. Gets an input, a pattern to look for, and an action to apply.
2. Reads the input.
3. Checks the first line of the input against the pattern.
4. If the pattern matches, applies the action.
5. Moves to the next line and repeats until the end of the file.

Of course, awk can take multiple patterns and actions; they are applied in sequence to each line. We can also change awk's behaviour so that it does not split its input by lines but by some other separator.
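
The steps above can be sketched with rules that carry an explicit pattern. This is a minimal sketch, assuming the sample ~animals.txt~ from the start of this document (recreated in the first lines so the snippet runs on its own); the patterns themselves are only illustrations:

#+BEGIN_SRC sh
# Recreate the sample file so this snippet is self-contained.
printf '%s\n' 'Animals Quantity' 'Dogs    32' 'Cats    17' 'Birds   25' \
  'Cows    7' 'Ducks   9' 'Pigs   12' > animals.txt

# Regex pattern: the action runs only on lines that start with "C".
awk '/^C/ {print}' animals.txt
# Cats    17
# Cows    7

# Comparison pattern: skip the header (NR > 1) and keep quantities above 20.
awk 'NR > 1 && $2 > 20 {print $1}' animals.txt
# Dogs
# Birds

# Two rules in one program: both are tried, in order, on every line.
awk '/^D/ {print "D:", $1} NR > 1 && $2 < 10 {print "few:", $1}' animals.txt
# D: Dogs
# few: Cows
# D: Ducks
# few: Ducks
#+END_SRC

Note that "Ducks" matches both rules in the last command, so both actions run for that line.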


** Better examples
With awk we can print the content of a file with:
#+BEGIN_EXAMPLE awk 
$ awk '{print $0}' animals.txt
Animals Quantity
Dogs    32
Cats    17
Birds   25
Cows    7
Ducks   9
Pigs   12
#+END_EXAMPLE

We can also print only the first field of each line with $1:
#+BEGIN_EXAMPLE awk 
$ awk '{print $1}' animals.txt
Animals
Dogs
Cats
Birds
Cows
Ducks
Pigs
#+END_EXAMPLE

We can also print only the second field of each line with $2:
#+BEGIN_EXAMPLE awk 
$ awk '{print $2}' animals.txt
Quantity
32
17
25
7
9
12
#+END_EXAMPLE


*What just happened?*
/"$0", "$1" and "$2" look like the positional parameters of a shell script, but they mean something different in awk. Instead of the script name and its first and second arguments, they refer to the entire current line and to its first and second fields, respectively./
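
Fields can also be combined and reordered. The sketch below assumes the same ~animals.txt~ (recreated first so it runs on its own) and uses the built-in variable ~NF~, which holds the number of fields on the current line:

#+BEGIN_SRC sh
# Recreate the sample file so this snippet is self-contained.
printf '%s\n' 'Animals Quantity' 'Dogs    32' 'Cats    17' 'Birds   25' \
  'Cows    7' 'Ducks   9' 'Pigs   12' > animals.txt

# Swap the two fields; the comma inserts the output separator (a space by default).
awk '{print $2, $1}' animals.txt
# Quantity Animals
# 32 Dogs
# ... and so on for the remaining lines

# NF is the field count of the current line, so $NF is its last field.
awk '{print NF, $NF}' animals.txt
# 2 Quantity
# 2 32
# ... and so on for the remaining lines
#+END_SRC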


** What else can be done 
Cool stuff that you can do now with this new knowledge.
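
As one possible illustration, tying back to the typical uses listed earlier (arithmetic operations and formatted text reports), the sketch below sums the second column and prints an aligned report. It assumes the same ~animals.txt~ (recreated first so it runs on its own); the column widths and the "Total" label are arbitrary choices made for this example:

#+BEGIN_SRC sh
# Recreate the sample file so this snippet is self-contained.
printf '%s\n' 'Animals Quantity' 'Dogs    32' 'Cats    17' 'Birds   25' \
  'Cows    7' 'Ducks   9' 'Pigs   12' > animals.txt

# For every line after the header: print an aligned row and add to a running sum.
# The END rule runs once, after the last line has been read.
awk 'NR > 1 { printf "%-8s %3d\n", $1, $2; total += $2 }
     END    { printf "%-8s %3d\n", "Total", total }' animals.txt
# Dogs      32
# Cats      17
# Birds     25
# Cows       7
# Ducks      9
# Pigs      12
# Total    102
#+END_SRC

Here ~%-8s~ left-aligns the name in an 8-character column and ~%3d~ right-aligns the number in a 3-character column.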

** Resources 
- [[https://www.gnu.org/software/gawk/manual/][Gawk: Effective AWK Programming]]
- [[https://learnbyexample.gitbooks.io/command-line-text-processing/content/gnu_awk.html][Learn by Example - AWK]]
- [[https://www.ibm.com/developerworks/library/l-awk1/][AWK by example - IBM]]
- [[http://www.grymoire.com/Unix/Awk.html][AWK - Grymoire]]
- [[https://www.tutorialspoint.com/awk/][AWK Tutorial - Tutorialspoint]]
- [[http://www.softpanorama.org/Tools/awk.shtml][AWK Programming - Softpanorama]]
